Priority-Based k-Anonymity Accomplished by Weighted Generalisation Structures
Abstract
Biobanks are gaining in importance by storing large collections of patients' clinical data (e.g. disease history, laboratory parameters, diagnosis, lifestyle) together with biological materials such as tissue samples, blood or other body fluids. When these patient-specific data are released for medical studies, privacy protection has to be guaranteed for ethical and legal reasons. k-anonymity may be used to ensure privacy by generalising and suppressing attributes so that enough data twins are released to mask patients' identities. However, data transformation techniques like generalisation may produce anonymised data that are unusable for medical studies because some attributes become too coarse-grained. We propose a priority-driven anonymisation technique that allows the degree of acceptable information loss to be specified for each attribute separately. We use generalisation and suppression of attributes together with a weighting scheme for quantifying generalisation steps. Our approach handles both numerical and categorical attributes and provides data anonymisation based on priorities and weights. The anonymisation algorithm described in this paper has been implemented and tested on a carcinoma data set. We discuss some general privacy-protecting methods for medical data and show some medically relevant use cases that benefit from our anonymisation technique.
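The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes a greedy strategy in which each attribute carries a weight (a higher weight means a higher priority for keeping detail) and the next generalisation step is applied to whichever attribute currently has the lowest cost `weight * (level + 1)`. The hierarchies (`generalise_age`, `generalise_zip`), the attribute names, and the cost formula are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical generalisation hierarchies: level 0 is the raw value,
# each higher level is coarser. Names and levels are illustrative only.
def generalise_age(age, level):
    if level == 0:
        return str(age)
    if level == 1:
        lo = age // 10 * 10
        return f"{lo}-{lo + 9}"  # ten-year band
    return "*"  # level 2: attribute fully suppressed

def generalise_zip(zip_code, level):
    # Mask the last `level` characters with '*'.
    level = min(level, len(zip_code))
    return zip_code[:len(zip_code) - level] + "*" * level

# attribute name -> (generalisation function, maximum level)
HIERARCHIES = {"age": (generalise_age, 2), "zip": (generalise_zip, 4)}

def is_k_anonymous(rows, k):
    """True if every combination of values occurs at least k times."""
    return all(count >= k for count in Counter(rows).values())

def apply_levels(records, levels):
    """Generalise each record to the given per-attribute levels."""
    return [
        tuple(HIERARCHIES[a][0](rec[a], levels[a]) for a in sorted(levels))
        for rec in records
    ]

def anonymise(records, k, weights):
    """Greedy sketch: generalise the cheapest attribute until k-anonymous.

    The cost of the next step for attribute a is weights[a] * (level + 1),
    an assumed formula, so heavily weighted attributes stay detailed longer.
    """
    levels = {a: 0 for a in HIERARCHIES}
    while True:
        rows = apply_levels(records, levels)
        if is_k_anonymous(rows, k):
            return levels, rows
        candidates = [a for a in levels if levels[a] < HIERARCHIES[a][1]]
        if not candidates:
            raise ValueError("k-anonymity not reachable with these hierarchies")
        cheapest = min(candidates, key=lambda a: weights[a] * (levels[a] + 1))
        levels[cheapest] += 1
```

Calling `anonymise(records, k, weights)` with a larger weight on `age` than on `zip` generalises the postal code first and touches the age only once further masking of the postal code would cost more. A real implementation would also need suppression of individual records and an information-loss measure, which this sketch leaves out.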
Similar resources
Achieving k-Anonymity by Clustering in Attribute Hierarchical Structures
Individual privacy will be at risk if a published data set is not properly de-identified. k-anonymity is a major technique to de-identify a data set. A more general view of k-anonymity is clustering with a constraint of the minimum number of objects in every cluster. Most existing approaches to achieving k-anonymity by clustering are for numerical (or ordinal) attributes. In this paper, we stud...
Achieving k-anonymity Using Improved Greedy Heuristics for Very Large Relational Databases
Advances in data storage, data collection and inference techniques have enabled the creation of huge databases of personal information. Dissemination of information from such databases, even if formally anonymised, creates a serious threat to individual privacy through statistical disclosure. One of the key methods developed to limit statistical disclosure risk is k-anonymity. Several methods ha...
Enhancing Informativeness in Data Publishing while Preserving Privacy using Coalitional Game Theory
k-Anonymity is one of the most popular conventional techniques for protecting the privacy of an individual. The shortcomings in the process of achieving k-Anonymity are presented and addressed by using Coalitional Game Theory (CGT) [1] and Concept Hierarchy Tree (CHT). The existing system considers information loss as a control parameter and provides anonymity level (k) as output. This paper pr...
A Semantic-Based K-Anonymity Scheme for Health Record Linkage
Record linkage is a technique for integrating data from sources or providers where direct access to the data is not possible due to security and privacy considerations. This is a very common scenario for medical data, as patient privacy is a significant concern. To avoid privacy leakage, researchers have adopted k-anonymity to protect raw data from re-identification; however, they cannot avoid as...
OPTIMUM WEIGHTED MODE COMBINATION FOR NONLINEAR STATIC ANALYSIS OF STRUCTURES
In recent years, some multi-mode pushover procedures that take higher mode effects into account have been proposed. The responses of the considered modes are combined by the quadratic combination rules, although using the elastic modal combination rules in the inelastic phases is not valid. Here, an optimum weighted mode combination method for nonlinear static analysis is presented. Genetic algorithm is ...